Warehousing Structured and Unstructured Data for Data
Abstract
More data, especially unstructured data, is available to users than ever. There is so much data available that it is difficult for users to make use of their data in its raw form. To handle the diversity of data types, we have designed and prototyped a multidatabase/warehouse system. The system has been especially designed to facilitate the interaction of structured and unstructured data. The system makes use of object-oriented views. The main features of the view mechanism, especially as they relate to textual documents, are presented in the paper. The system is designed to take target documents either from large repositories or from the Web. Issues for both sources of documents are examined in the paper. The paper also looks at how the view approach allows the interaction between the data taken from structured (e.g., relational), semistructured (e.g., object-oriented) and unstructured (e.g., text) data sources. The warehouse support provided by the system is briefly examined and the paper concludes by looking at our approach to data mining and how the system will operate in the complete environment.
Similar resources
Text Analytics to Data Warehousing
Information hidden or stored in unstructured data can play a critical role in making decisions, understanding and conducting other business functions. Integrating data stored in both structured and unstructured formats can add significant value to an organization. With the extent of development happening in Text Mining and technologies to deal with unstructured and semi structured data like X...
Towards Business Intelligence over Unified Structured and Unstructured Data Using XML
Traditional data warehousing has been very successful in helping business enterprises make intelligent decisions through declarative analysis of large amounts of structured data stored in a relational database. However, not all enterprise data naturally fits into a relational model. Within an enterprise, there is a huge amount of unstructured data, such as document content, emails, spreadsheets...
From data warehousing to active information integration systems
Enterprises have gathered operational business information from multiple structured data sources and stored it in a central repository, called a data warehouse, for decision support and data analysis. Enterprises are now realizing the need to integrate all of their information sources, including "unstructured" contents, for deeper and richer information analysis. Several applications,...
Towards Complex Data Warehousing: A new approach for integrating and modeling complex data
With the development of the Internet, the availability of various types of data (multimedia, data from databases, ...) has increased. These data, which present different forms and semantics (we name them "complex data"), may be unstructured, structured, or semi-structured. In order to prepare them for proper analysis, a multidimensional modeling layer is needed. To be efficient in terms of quality of se...
An MAS-Based ETL Approach for Complex Data
In a data warehousing process, the phase of data integration is crucial. Many methods for data integration have been published in the literature. However, with the development of the Internet, the availability of various types of data (images, texts, sounds, videos, databases...) has increased, and structuring such data is a difficult task. We name these data, which may be structured or unstruc...
Publication date: 1997